Detecting coevolving amino acid sites using Bayesian mutational mapping
Abstract
MOTIVATION: The evolution of protein sequences is constrained by complex interactions between amino acid residues. Because harmful substitutions may be compensated for by other substitutions at neighboring sites, residues can coevolve. We describe a Bayesian phylogenetic approach to the detection of coevolving residues in protein families. This method, Bayesian mutational mapping (BMM), assigns mutations to the branches of the evolutionary tree stochastically, and then test statistics are calculated to determine whether a coevolutionary signal exists in the mapping. Posterior predictive P-values provide an estimate of significance, and specificity is maintained by integrating over uncertainty in the estimation of the tree topology, branch lengths and substitution rates. A coevolutionary Markov model for codon substitution is also described, and this model is used as the basis of several test statistics.

RESULTS: Results on simulated coevolutionary data indicate that the BMM method can successfully detect nearly all coevolving sites when the model has been correctly specified, and that non-parametric statistics such as mutual information are generally less powerful than parametric statistics. On a dataset of eukaryotic proteins from the phosphoglycerate kinase (PGK) family, interdomain site contacts yield a significantly greater coevolutionary signal than interdomain non-contacts, an indication that the method provides information about interacting sites. Failure to account for the heterogeneity in rates across sites in PGK resulted in a less discriminating test, yielding a marked increase in the number of reported positives at both contact and non-contact sites.

SUPPLEMENTARY INFORMATION: http://www.dimmic.net/supplement/
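The abstract names mutual information as one non-parametric test statistic and posterior predictive P-values as the measure of significance. The following is a minimal illustrative sketch of those two quantities, not the BMM implementation: it computes mutual information directly on two alignment columns (BMM instead computes statistics on stochastically mapped mutations), and it assumes the replicate statistics have already been produced by some null simulation. The function names `mutual_information` and `posterior_predictive_p` are hypothetical.

```python
import math
from collections import Counter


def mutual_information(col_a, col_b):
    """Mutual information (in nats) between two equal-length alignment columns.

    Simplifying assumption: frequencies are raw counts over sequences,
    with no phylogenetic correction.
    """
    if len(col_a) != len(col_b) or len(col_a) == 0:
        raise ValueError("columns must be non-empty and of equal length")
    n = len(col_a)
    count_a = Counter(col_a)
    count_b = Counter(col_b)
    count_ab = Counter(zip(col_a, col_b))
    mi = 0.0
    for (a, b), n_ab in count_ab.items():
        # p(a,b) * log( p(a,b) / (p(a) * p(b)) ), written with counts
        mi += (n_ab / n) * math.log(n_ab * n / (count_a[a] * count_b[b]))
    return mi


def posterior_predictive_p(observed, replicates):
    """Fraction of replicate statistics at least as large as the observed one.

    Assumption: `replicates` holds the same statistic computed on data
    simulated without coevolution (for example, one value per posterior sample).
    """
    if not replicates:
        raise ValueError("need at least one replicate statistic")
    return sum(1 for r in replicates if r >= observed) / len(replicates)
```

For two perfectly covarying columns such as `"AAGG"` and `"CCTT"`, `mutual_information` returns ln 2; for columns whose residues pair up independently it returns 0. A small P-value from `posterior_predictive_p` would indicate that the observed statistic is unusual relative to the null replicates.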
Similar resources
Identification of Coevolving Residues and Coevolution Potentials Emphasizing Structure, Bond Formation and Catalytic Coordination in Protein Evolution
The structure and function of a protein is dependent on coordinated interactions between its residues. The selective pressures associated with a mutation at one site should therefore depend on the amino acid identity of interacting sites. Mutual information has previously been applied to multiple sequence alignments as a means of detecting coevolutionary interactions. Here, we introduce a refin...
Spidermonkey: rapid detection of co-evolving sites using Bayesian graphical models
Spidermonkey is a new component of the Datamonkey suite of phylogenetic tools that provides methods for detecting coevolving sites from a multiple alignment of homologous nucleotide or amino acid sequences. It reconstructs the substitution history of the alignment by maximum likelihood-based phylogenetic methods, and then analyzes the joint distribution of substitution events using B...
A novel method for detecting intramolecular coevolution: adding a further dimension to selective constraints analyses.
Protein evolution depends on intramolecular coevolutionary networks whose complexity is proportional to the underlying functional and structural interactions among sites. Here we present a novel approach that vastly improves the sensitivity of previous methods for detecting coevolution through a weighted comparison of divergence between amino acid sites. The analysis of the HIV-1 Gag protein de...
Statistical properties of the methods for detecting positively selected amino acid sites.
Parsimony and Bayesian methods have been developed for detecting positively selected amino acid sites. It has been reported that the parsimony method is generally conservative. In contrast, the Bayesian method is known to identify more positively selected sites than the parsimony method, especially when the number of sequences analyzed is small, although the interpretation of results obtained f...
Bi-Factor Analysis Based on Noise-Reduction (BIFANR): A New Algorithm for Detecting Coevolving Amino Acid Sites in Proteins
Previous statistical analyses have shown that amino acid sites in a protein evolve in a correlated way instead of independently. Even though located distantly in the linear sequence, the coevolved amino acids could be spatially adjacent in the tertiary structure, and constitute specific protein sectors. Moreover, these protein sectors are independent of one another in structure, function, and e...
Journal title:
- Bioinformatics
Volume 21 Suppl 1
Publication date: 2005